Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QBFT: Fix validation of proposal based on older round's prepared block #7875

Merged
merged 7 commits into from
Nov 25, 2024

Conversation

pullurib
Copy link
Contributor

@pullurib pullurib commented Nov 11, 2024

PR description

From the logs, this particular occurence of the bug happened when transaction selection reached the time limit and then the round ran out of time before commit.

14:35:35.191+00:00 | BftProcessorExecutor-QBFT-0 | WARN  | BlockTransactionSelector | Interrupting transaction selection since it is taking more than the max configured time of 3000ms
14:35:35.191+00:00 | BftProcessorExecutor-QBFT-0 | DEBUG | TransactionSelectionResults | Selection stats: Totals[Evaluated=9, Selected=9, NotSelected=0, Discarded=0]; Detailed[]

There was a block proposed and prepared but not committed so it was propagated to new round .

14:35:36.512+00:00 | BftProcessorExecutor-QBFT-0 | DEBUG | QbftRound | Sending prepare message. round=ConsensusRoundIdentifier{Sequence=804237, Round=1}
14:35:38.468+00:00 | BftProcessorExecutor-QBFT-0 | DEBUG | QbftRound | Received a prepare message. round=ConsensusRoundIdentifier{Sequence=804237, Round=1}. author=0x7f8843b0d49735fa174f202e7f18cc5f3142625f
14:35:39.957+00:00 | BftProcessorExecutor-QBFT-0 | DEBUG | QbftRound | Received a prepare message. round=ConsensusRoundIdentifier{Sequence=804237, Round=1}. author=0xbd8c15e765f79f9fb1c5944e29ffcdb964170528
14:35:39.957+00:00 | BftProcessorExecutor-QBFT-0 | DEBUG | QbftRound | Sending commit message. round=ConsensusRoundIdentifier{Sequence=804237, Round=1}
14:35:39.961+00:00 | BftProcessorExecutor-QBFT-0 | DEBUG | QbftRound | Received a commit message. round=ConsensusRoundIdentifier{Sequence=804237, Round=1}. author=0xbd8c15e765f79f9fb1c5944e29ffcdb964170528
14:35:40.189+00:00 | BftProcessorExecutor-QBFT-0 | DEBUG | QbftBlockHeightManager | Round has expired, creating PreparedCertificate and notifying peers. round=ConsensusRoundIdentifier{Sequence=804237, Round=1}

The new proposer identifies the Round Change (RC) message with the highest prepared round and bases a new proposal on the attached prepared message for that highest round. The proposal’s block header reflects the current round number, but the prepared message still references the prior round. In the code, an attempt is made to match the proposal’s block header round with the highest prepared round for validation by comparing the hashes of the proposed block and the underlying prepared block. However, the hashing method in BlockHeaderFunctions used is incorrect, as it replaces the round with 0.

As a result, validators continue sending RC messages with their already prepared blocks, causing this error to repeat and stall the network, preventing consensus progress. This issue occurs only when the highest prepared round is greater than 0—meaning consensus has moved to round 2 or beyond, using a prepared message from round 1 or later.

14:37:34.878+00:00 | BftProcessorExecutor-QBFT-0 | DEBUG | QbftBlockHeightManager | Received sufficient RoundChange messages to change round to targetRound=ConsensusRoundIdentifier{Sequence=804237, Round=5}
14:37:34.878+00:00 | BftProcessorExecutor-QBFT-0 | DEBUG | QbftRound | Sending proposal from PreparedCertificate. round=ConsensusRoundIdentifier{Sequence=804237, Round=5}
14:37:36.211+00:00 | BftProcessorExecutor-QBFT-0 | INFO  | ProposalValidator | Invalid Proposal Payload: Latest Prepared Metadata blockhash does not align with proposed block

So made change to use the right hashing method which doesn’t touch the round number.

I was able to repro the error by forcing one of the nodes to not send commit message in round 0 and round 1 .

Fixed Issue(s)

#6732 and may be #6613 and may be #6053

Signed-off-by: Bhanu Pulluri <bhanu.pulluri@kaleido.io>
Copy link
Contributor

@matthew1001 matthew1001 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it easy to write a test for this? Would be nice to add a regression test if possible. Will add an approval once I've ready through the forCommittedSeal vs forOnchainBlcok code a bit more thoroughly.

@matthew1001
Copy link
Contributor

Just reading through the code and the detailed explanation you've outlined in the PR, the one thing I wanted to check related to this:

The new proposer identifies the Round Change (RC) message with the highest prepared round and bases a new proposal on the attached prepared message for that highest round. The proposal’s block header reflects the current round number, but the prepared message still references the prior round.

Your fix sounds like it address this by effectively ignoring the round number. I'm wondering if there's an alternative fix which is to make sure the round number in the proposal message is correct for the new round?

@jframe I wonder if you've got time to take a look at this PR that Bhanu has raised to see what you think?

@pullurib
Copy link
Contributor Author

Is it easy to write a test for this? Would be nice to add a regression test if possible. Will add an approval once I've ready through the forCommittedSeal vs forOnchainBlcok code a bit more thoroughly.

Yes I was thinking about the test , will check if it's relatively easy to add one.

@pullurib
Copy link
Contributor Author

Just reading through the code and the detailed explanation you've outlined in the PR, the one thing I wanted to check related to this:

The new proposer identifies the Round Change (RC) message with the highest prepared round and bases a new proposal on the attached prepared message for that highest round. The proposal’s block header reflects the current round number, but the prepared message still references the prior round.

Your fix sounds like it address this by effectively ignoring the round number. I'm wondering if there's an alternative fix which is to make sure the round number in the proposal message is correct for the new round?

@jframe I wonder if you've got time to take a look at this PR that Bhanu has raised to see what you think?

That's true, I got it incorrect about inclusion of round number . Currently (before this PR) prepared block and proposal both don't include round number in hash calculation . So , as far as validation goes , it should only be checked if prepared and proposed blocks hashes are equal (without any round number replacement) , which happens to be the case based on my experimentation. Prepared round and proposal round info are in metadata .

Bhanu Pulluri added 2 commits November 14, 2024 01:05
Signed-off-by: Bhanu Pulluri <bhanu.pulluri@kaleido.io>
Signed-off-by: Bhanu Pulluri <bhanu.pulluri@kaleido.io>
@pullurib
Copy link
Contributor Author

Made changes to fix the other way as proposed above - adding round numbers to new block proposals and updating the round number proposals based on prepared blocks to current round. updates are made in the block header .

@pullurib
Copy link
Contributor Author

Tested and verified the fix manually , will continue looking into adding a testcase

- Check the result of proposer's self proposal validation before sending prepare message
- Update roundchange testcases to include round numbers in blocks

Signed-off-by: Bhanu Pulluri <bhanu.pulluri@kaleido.io>
@pullurib
Copy link
Contributor Author

Figured that there's already a testcase which could have caught this error -

public void proposalMessageContainsBlockOnWhichPeerPrepared() {

It throws the validation error in logs

2024-11-14 16:02:16.999-05:00 | Test worker | INFO  | RoundChangeManager | Address: 0xb02c0093e94b417620f24d8857d6a1dd3d230272  Round: 4
2024-11-14 16:02:17.013-05:00 | Test worker | INFO  | RoundTimer | BFT round 3 expired. Moved to round 4 which will expire in 192 seconds
2024-11-14 16:02:17.029-05:00 | Test worker | INFO  | ProposalValidator | Invalid Proposal Payload: Latest Prepared Metadata blockhash does not align with proposed block
Task :consensus:qbft:integrationTest 

but doesn’t fail due to another bug in the code - while proposer handles their own proposal, it doesn’t check if the proposal is valid before sending prepare. Just fixing that made the above test fail. Then included the change to add round number in block header extradata, corresponding test case changes and verified all the testcases in consensus/qbft/src/integration-test/java/org/hyperledger/besu/consensus/qbft/test pass.

@matthew1001
Copy link
Contributor

This feels like a better fix to me @pullurib, thanks for tweaking it and for the extra tests.

@macfarla
Copy link
Contributor

Since this fixes a number of issues, think it's worth adding a changelog entry @pullurib

Signed-off-by: Bhanu Pulluri <bhanu.pulluri@kaleido.io>
@pullurib
Copy link
Contributor Author

Since this fixes a number of issues, think it's worth adding a changelog entry @pullurib

Done, thanks

@macfarla macfarla enabled auto-merge (squash) November 25, 2024 20:33
@macfarla macfarla merged commit f64c147 into hyperledger:main Nov 25, 2024
43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants